A New Hybrid Schemes Combining Ontology and Clustering for Text Documents
Similar resources
Approaches to Ontology Based Algorithms for Clustering Text Documents
Advances in digital technology and the World Wide Web have increased the use of digital documents for purposes such as e-publishing and digital libraries. The growing number of text documents calls for efficient techniques to support searching and retrieval. Document clustering is one such technique; it automatically organizes text documents into meaningful groups. This...
Text Mining with Hybrid Clustering Schemes
Hybrid information retrieval (IR) schemes combine different normalization techniques and similarity functions. Hybrid schemes provide an efficient technique to improve precision and recall (see e.g., [4]). This paper reports a hybrid clustering scheme that applies a singular value decomposition (SVD) based algorithm followed by a k-means type clustering algorithm. The output of the first algorithm b...
A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and a Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...
Enhancing Traditional Text Documents Clustering based on Ontology
Ontologies are currently a hot topic in the Semantic Web area. Current clustering research emphasizes the development of more efficient clustering methods and focuses mainly on term weight calculation, without considering domain knowledge. This paper investigates how ontologies can also be applied to the clustering process. To complement the traditional clustering method, more infor...
Clustering Full Text Documents
An index or topic hierarchy of full-text documents can organize a domain and speed information retrieval. Traditional indexes, like the Library of Congress system or Dewey Decimal system, are generated by hand, updated infrequently, and applied inconsistently. With machine learning, they can be generated automatically, updated as new documents arrive, and applied consistently. Despite the appea...
Journal
Journal title: Information Technology Journal
Year: 2013
ISSN: 1812-5638
DOI: 10.3923/itj.2013.2447.2453